Optimize agg with collation & Optimize mem utils #5834

Merged: 21 commits, Sep 16, 2022

Conversation

solotzg
Contributor

@solotzg solotzg commented Sep 9, 2022

What problem does this PR solve?

Issue Number: ref #5294

What is changed and how it works?

  • Optimize aggregation with collation:
    • Add ITiDBCollator::compareFastPath and ITiDBCollator::sortKeyFastPath, which check for a binary collation and take a fast path.
    • Add std::unique_ptr<AggregationMethodMultiStringNoCache<AggregatedDataWithStringKey>> multi_key_string; to handle aggregations whose keys are all strings.
  • Use inline_memcpy (libs/libcommon/include/common/memcpy.h) to accelerate memory copies for small buffers.
  • Replace the default memcpy with __folly_memcpy (libs/libmemcpy/folly/memcpy.S).
  • Replace the default memory comparison functions of StringRef with an internal implementation.
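
To illustrate the fast-path idea for binary collations, here is a minimal sketch. It is not the code in this PR: the `Collator` class, its `is_binary` flag, and `compareFast` are hypothetical names, and the sketch ignores details such as trailing-space padding that real collations may require. The point is only that a non-virtual flag check lets byte-wise comparison bypass the virtual call.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string_view>

// Byte-wise ordering: memcmp on the common prefix, then the shorter string sorts first.
inline int binaryCompare(std::string_view a, std::string_view b)
{
    const size_t n = std::min(a.size(), b.size());
    if (n != 0)
    {
        const int r = std::memcmp(a.data(), b.data(), n);
        if (r != 0)
            return r < 0 ? -1 : 1;
    }
    if (a.size() == b.size())
        return 0;
    return a.size() < b.size() ? -1 : 1;
}

// Hypothetical collator interface (illustrative only, not TiFlash's real API).
class Collator
{
public:
    explicit Collator(bool is_binary_) : is_binary(is_binary_) {}
    virtual ~Collator() = default;
    virtual int compare(std::string_view a, std::string_view b) const = 0;
    const bool is_binary; // plain data member, readable without virtual dispatch
};

struct BinaryCollator final : Collator
{
    BinaryCollator() : Collator(true) {}
    int compare(std::string_view a, std::string_view b) const override { return binaryCompare(a, b); }
};

// ASCII case-insensitive collation, standing in for any non-binary collation.
struct CaseInsensitiveCollator final : Collator
{
    CaseInsensitiveCollator() : Collator(false) {}
    int compare(std::string_view a, std::string_view b) const override
    {
        const size_t n = std::min(a.size(), b.size());
        for (size_t i = 0; i < n; ++i)
        {
            const int x = std::tolower(static_cast<unsigned char>(a[i]));
            const int y = std::tolower(static_cast<unsigned char>(b[i]));
            if (x != y)
                return x < y ? -1 : 1;
        }
        if (a.size() == b.size())
            return 0;
        return a.size() < b.size() ? -1 : 1;
    }
};

// Fast path: binary collations skip the virtual call and use the inlinable byte compare.
inline int compareFast(const Collator & c, std::string_view a, std::string_view b)
{
    if (c.is_binary)
        return binaryCompare(a, b);
    return c.compare(a, b);
}
```

With a `BinaryCollator`, `compareFast` never reaches the virtual `compare`; with any other collator it falls back to the regular path.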

Benchmark

ENV

  • tpch-100
  • 1 tiflash
  • original a8c8cb1

Query

  • limit CPU usage to 2000%
  • SQL: tpch-100 q1
select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
    sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    lineitem
where
    l_shipdate <= date_sub('1998-12-01', interval 108 day)
group by
    l_returnflag,
    l_linestatus
order by
    l_returnflag,
    l_linestatus;
Time(s)   Original   Optimized   Improvement (Optimized : Original)
--------------------------------------------------------------------
          11.48      10.68
          10.96      10.5
          11.18      10.66
          11.16      10.67
          11.18      10.51
AVG       11.192     10.604      5.55%

Improvement = AVG(Original) / AVG(Optimized) - 1.0
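
For reference, the improvement figure can be reproduced from the raw timings. This is only a small sketch of the arithmetic stated in the table, AVG(Original) / AVG(Optimized) - 1.0; the function names `mean` and `improvement` are made up for illustration.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty sample.
inline double mean(const std::vector<double> & xs)
{
    assert(!xs.empty());
    return std::accumulate(xs.begin(), xs.end(), 0.0) / static_cast<double>(xs.size());
}

// Relative improvement as defined in the table: AVG(Original) / AVG(Optimized) - 1.0.
inline double improvement(const std::vector<double> & original, const std::vector<double> & optimized)
{
    return mean(original) / mean(optimized) - 1.0;
}
```

Feeding in the five q1 timings (11.48, 10.96, 11.18, 11.16, 11.18 against 10.68, 10.5, 10.66, 10.67, 10.51) gives averages of 11.192 and 10.604 and an improvement of about 0.0555, matching the reported 5.55%.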

Test memory comparison by string sort

  • limit CPU usage to 500%
  • SQL
select max(l_comment) from lineitem;
Time(s)   Original   Optimized   Improvement (Optimized : Original)
--------------------------------------------------------------------
          7.71       7.05
          7.75       6.96
          7.95       7.07
          7.61       7.23
          7.83       7.07
AVG       7.77       7.076       9.81%

Improvement = AVG(Original) / AVG(Optimized) - 1.0

MemUtils

Test memory copy

  • __folly_memcpy, which benefits from AVX2 instructions, performs better when strings are not small (> 80 bytes).
  • The new implementation of inline_memcpy is faster than the original one (from ClickHouse).
MemUtilsCopy_${min}_${max}_${align}_true_${loop}: generate 4095 string pairs with sizes in [${min}, ${max}] and a specific alignment, then copy them one by one ${loop} times.

Time(ns)                          STL (GNU libc 2.17)  Original (ClickHouse)  Optimized (TiFlash)  Folly   STL / Optimized - 1.0  Original / Optimized - 1.0  Original / Folly - 1.0
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
MemUtilsCopy_1_20_3_true_20000    352978               124102                 109755               151166  221.61%                13.07%                      -17.90%
MemUtilsCopy_1_40_3_true_20000    388125               169980                 122037               189678  218.04%                39.29%                      -10.38%
MemUtilsCopy_1_80_3_true_20000    423425               222338                 215519               179408  96.47%                 3.16%                       23.93%
MemUtilsCopy_1_200_3_true_20000   567271               406959                 368982               208478  53.74%                 10.29%                      95.20%
MemUtilsCopy_1_1000_3_true_20000  1021812              624687                 576924               362513  77.11%                 8.28%                       72.32%
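
The small-size rows are where an inlined copy tends to pay off. One common way to handle short lengths without a byte loop is a pair of overlapping fixed-size moves. The sketch below shows that general technique only; it is not the inline_memcpy from this PR, and `copyUpTo16` and `smallCopy` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies n bytes (0 <= n <= 16) between non-overlapping buffers.
// Each branch moves the head and the tail with two fixed-size copies that may
// overlap in the middle, so no per-byte loop is needed.
inline void copyUpTo16(char * dst, const char * src, size_t n)
{
    assert(n <= 16);
    if (n >= 8)
    {
        uint64_t head, tail;
        std::memcpy(&head, src, 8);
        std::memcpy(&tail, src + n - 8, 8);
        std::memcpy(dst, &head, 8);
        std::memcpy(dst + n - 8, &tail, 8);
    }
    else if (n >= 4)
    {
        uint32_t head, tail;
        std::memcpy(&head, src, 4);
        std::memcpy(&tail, src + n - 4, 4);
        std::memcpy(dst, &head, 4);
        std::memcpy(dst + n - 4, &tail, 4);
    }
    else if (n >= 2)
    {
        uint16_t head, tail;
        std::memcpy(&head, src, 2);
        std::memcpy(&tail, src + n - 2, 2);
        std::memcpy(dst, &head, 2);
        std::memcpy(dst + n - 2, &tail, 2);
    }
    else if (n == 1)
    {
        *dst = *src;
    }
}

// Dispatch: short lengths stay inline, longer ones go to the library memcpy.
inline void smallCopy(char * dst, const char * src, size_t n)
{
    if (n <= 16)
        copyUpTo16(dst, src, n);
    else
        std::memcpy(dst, src, n);
}
```

The fixed-size `std::memcpy` calls are typically compiled to single load/store instructions, which is what makes this shape attractive for short strings; the threshold of 16 bytes is an arbitrary choice for the sketch.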

Test memory comparison

MemUtilsCmp_${str-size}_${loop_times}: compare strings of ${str-size} bytes, repeated ${loop_times} times.

Time(ns)                 STL (GNU libc 2.17)  Optimized (AVX2)  STL / Optimized - 1.0
--------------------------------------------------------------------------------------
MemUtilsCmp_2_20         66.5                 51.9              28.13%
MemUtilsCmp_13_20        75.3                 66                14.09%
MemUtilsCmp_65_20        126                  106               18.87%
MemUtilsCmp_100_20       167                  106               57.55%
MemUtilsCmp_10000_20     5145                 3740              37.57%
MemUtilsCmp_100000_20    81996                68577             19.57%
MemUtilsCmp_1000000_20   1254279              1112721           12.72%
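
As a rough picture of what a wide memory comparison does, the sketch below checks equality 8 bytes at a time and finishes with a byte loop. It is a portable stand-in, not the AVX2 implementation benchmarked above (which would work on 32-byte vectors), and `memEqual` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns true when the first n bytes of a and b are identical.
inline bool memEqual(const char * a, const char * b, size_t n)
{
    size_t i = 0;
    // Main loop: compare one 8-byte word per iteration.
    for (; i + 8 <= n; i += 8)
    {
        uint64_t x, y;
        std::memcpy(&x, a + i, 8); // memcpy keeps unaligned loads well-defined
        std::memcpy(&y, b + i, 8);
        if (x != y)
            return false;
    }
    // Tail: fewer than 8 bytes remain.
    for (; i < n; ++i)
    {
        if (a[i] != b[i])
            return false;
    }
    return true;
}
```

An ordering comparison (less/greater) needs one extra step after a mismatching word is found, namely locating the first differing byte; that part is omitted here.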

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@ti-chi-bot
Member

ti-chi-bot commented Sep 9, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • windtalker
  • zanmato1984

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 9, 2022
Signed-off-by: Zhigao Tong <tongzhigao@pingcap.com>
@solotzg
Contributor Author

solotzg commented Sep 14, 2022

/run-all-tests

@sre-bot
Collaborator

sre-bot commented Sep 14, 2022

Coverage for changed files

Filename                                                     Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dbms/src/AggregateFunctions/AggregateFunctionMinMaxAny.h         283               185    34.63%          86                50    41.86%         484               342    29.34%         174               129    25.86%
dbms/src/AggregateFunctions/IAggregateFunction.h                 171               135    21.05%          35                18    48.57%         226               182    19.47%         102                90    11.76%
dbms/src/Columns/ColumnString.cpp                                172                82    52.33%          26                12    53.85%         402               206    48.76%         118                66    44.07%
dbms/src/Columns/ColumnString.h                                   67                10    85.07%          32                 4    87.50%         167                28    83.23%          24                 5    79.17%
dbms/src/Columns/ColumnsCommon.cpp                                72                15    79.17%          17                 7    58.82%         158                29    81.65%          44                 8    81.82%
dbms/src/Common/ColumnsHashing.h                                 113                80    29.20%          23                13    43.48%         230               142    38.26%          52                44    15.38%
dbms/src/Common/ColumnsHashingImpl.h                              86                38    55.81%          24                13    45.83%         159                57    64.15%          30                13    56.67%
dbms/src/Functions/FunctionsComparison.h                         604               301    50.17%          63                28    55.56%         949               502    47.10%         476               282    40.76%
dbms/src/Functions/LeastGreatest.h                                25                 6    76.00%          10                 4    60.00%          50                12    76.00%          10                 2    80.00%
dbms/src/Interpreters/Aggregator.cpp                            2852              2062    27.70%          75                40    46.67%        1618               993    38.63%        1236               939    24.03%
dbms/src/Interpreters/Aggregator.h                              1025               849    17.17%          43                21    51.16%         205                84    59.02%         422               177    58.06%
dbms/src/Storages/Transaction/Collator.h                          20                 1    95.00%           8                 1    87.50%          34                 1    97.06%          14                 2    85.71%
dbms/src/Storages/Transaction/CollatorUtils.h                     34                 4    88.24%          11                 1    90.91%          68                 8    88.24%          14                 1    92.86%
dbms/src/Storages/Transaction/TiDB.cpp                           484               261    46.07%          46                13    71.74%         864               389    54.98%         336               144    57.14%
libs/libcommon/include/common/StringRef.h                         35                 6    82.86%          21                 5    76.19%          86                19    77.91%          10                 1    90.00%
libs/libcommon/include/common/avx2_mem_utils.h                   269                37    86.25%          15                 0   100.00%         274                11    95.99%         146                 2    98.63%
libs/libcommon/include/common/memcpy.h                             1                 0   100.00%           1                 0   100.00%           7                 3    57.14%           0                 0         -
libs/libcommon/include/common/sse2_memcpy.h                       46                 0   100.00%           1                 0   100.00%         111                27    75.68%          18                 1    94.44%
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                           6359              4072    35.96%         537               230    57.17%        6092              3035    50.18%        3226              1906    40.92%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18756      8142             56.59%    217244  83760        61.44%

full coverage report (for internal network access only)

Contributor

@windtalker windtalker left a comment

LGTM

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 15, 2022
Signed-off-by: Zhigao Tong <tongzhigao@pingcap.com>
Contributor

@zanmato1984 zanmato1984 left a comment

LGTM

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 16, 2022
@solotzg
Contributor Author

solotzg commented Sep 16, 2022

/hold

@ti-chi-bot ti-chi-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 16, 2022
@solotzg
Contributor Author

solotzg commented Sep 16, 2022

/merge

@ti-chi-bot
Member

@solotzg: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Member

This pull request has been accepted and is ready to merge.

Commit hash: ce367a6

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 16, 2022
@sre-bot
Collaborator

sre-bot commented Sep 16, 2022

Coverage for changed files

Filename                                                     Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dbms/src/AggregateFunctions/AggregateFunctionMinMaxAny.h         283               183    35.34%          86                49    43.02%         484               335    30.79%         174               127    27.01%
dbms/src/AggregateFunctions/IAggregateFunction.h                 171               135    21.05%          35                18    48.57%         226               182    19.47%         102                90    11.76%
dbms/src/Columns/ColumnString.cpp                                172                81    52.91%          26                12    53.85%         399               204    48.87%         118                65    44.92%
dbms/src/Columns/ColumnString.h                                   58                 7    87.93%          32                 3    90.62%         161                23    85.71%          18                 3    83.33%
dbms/src/Columns/ColumnsCommon.cpp                                72                15    79.17%          17                 7    58.82%         158                29    81.65%          44                 8    81.82%
dbms/src/Common/ColumnsHashing.h                                 113                80    29.20%          23                13    43.48%         230               142    38.26%          52                44    15.38%
dbms/src/Common/ColumnsHashingImpl.h                              86                37    56.98%          24                13    45.83%         159                52    67.30%          30                11    63.33%
dbms/src/Functions/FunctionsComparison.h                         604               302    50.00%          63                28    55.56%         949               503    47.00%         476               286    39.92%
dbms/src/Functions/LeastGreatest.h                                25                 6    76.00%          10                 4    60.00%          50                12    76.00%          10                 2    80.00%
dbms/src/Interpreters/Aggregator.cpp                            2852              2063    27.66%          75                40    46.67%        1618               995    38.50%        1236               939    24.03%
dbms/src/Interpreters/Aggregator.h                              1025               849    17.17%          43                21    51.16%         205                84    59.02%         422               177    58.06%
dbms/src/Storages/Transaction/Collator.h                          21                 1    95.24%           9                 1    88.89%          37                 1    97.30%          14                 2    85.71%
dbms/src/Storages/Transaction/CollatorCompare.h                   26                 4    84.62%           9                 1    88.89%          42                 5    88.10%          10                 1    90.00%
dbms/src/Storages/Transaction/CollatorUtils.h                      8                 0   100.00%           2                 0   100.00%          26                 3    88.46%           4                 0   100.00%
dbms/src/Storages/Transaction/TiDB.cpp                           484               261    46.07%          46                13    71.74%         864               389    54.98%         336               144    57.14%
libs/libcommon/include/common/StringRef.h                         36                 6    83.33%          22                 5    77.27%          89                19    78.65%          10                 1    90.00%
libs/libcommon/include/common/avx2_mem_utils.h                   269                37    86.25%          15                 0   100.00%         276                11    96.01%         146                 2    98.63%
libs/libcommon/include/common/mem_utils_opt.h                     14                 0   100.00%           3                 0   100.00%          44                21    52.27%           8                 0   100.00%
libs/libcommon/include/common/memcpy.h                             1                 0   100.00%           1                 0   100.00%           7                 3    57.14%           0                 0         -
libs/libcommon/include/common/sse2_memcpy.h                       46                 0   100.00%           1                 0   100.00%         111                27    75.68%          18                 0   100.00%
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                           6366              4067    36.11%         542               228    57.93%        6135              3040    50.45%        3228              1902    41.08%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18828      8138             56.78%    218647  83667        61.73%

full coverage report (for internal network access only)

@solotzg
Contributor Author

solotzg commented Sep 16, 2022

/unhold

@ti-chi-bot ti-chi-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 16, 2022
@solotzg
Contributor Author

solotzg commented Sep 16, 2022

/rebuild

@ti-chi-bot
Member

@solotzg: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@sre-bot
Collaborator

sre-bot commented Sep 16, 2022

Coverage for changed files

Filename                                                     Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dbms/src/AggregateFunctions/AggregateFunctionMinMaxAny.h         283               184    34.98%          86                50    41.86%         484               342    29.34%         174               128    26.44%
dbms/src/AggregateFunctions/IAggregateFunction.h                 171               135    21.05%          35                18    48.57%         226               182    19.47%         102                90    11.76%
dbms/src/Columns/ColumnString.cpp                                172                81    52.91%          26                12    53.85%         399               204    48.87%         118                65    44.92%
dbms/src/Columns/ColumnString.h                                   58                 7    87.93%          32                 3    90.62%         161                23    85.71%          18                 3    83.33%
dbms/src/Columns/ColumnsCommon.cpp                                72                15    79.17%          17                 7    58.82%         158                29    81.65%          44                 8    81.82%
dbms/src/Common/ColumnsHashing.h                                 113                80    29.20%          23                13    43.48%         230               142    38.26%          52                44    15.38%
dbms/src/Common/ColumnsHashingImpl.h                              86                38    55.81%          24                13    45.83%         159                57    64.15%          30                13    56.67%
dbms/src/Functions/FunctionsComparison.h                         604               301    50.17%          63                28    55.56%         949               502    47.10%         476               284    40.34%
dbms/src/Functions/LeastGreatest.h                                25                 6    76.00%          10                 4    60.00%          50                12    76.00%          10                 2    80.00%
dbms/src/Interpreters/Aggregator.cpp                            2852              2063    27.66%          75                40    46.67%        1618               996    38.44%        1236               940    23.95%
dbms/src/Interpreters/Aggregator.h                              1025               849    17.17%          43                21    51.16%         205                84    59.02%         422               177    58.06%
dbms/src/Storages/Transaction/Collator.h                          21                 1    95.24%           9                 1    88.89%          37                 1    97.30%          14                 2    85.71%
dbms/src/Storages/Transaction/CollatorCompare.h                   26                 4    84.62%           9                 1    88.89%          42                 5    88.10%          10                 1    90.00%
dbms/src/Storages/Transaction/CollatorUtils.h                      8                 0   100.00%           2                 0   100.00%          26                 3    88.46%           4                 0   100.00%
dbms/src/Storages/Transaction/TiDB.cpp                           484               261    46.07%          46                13    71.74%         864               389    54.98%         336               144    57.14%
libs/libcommon/include/common/StringRef.h                         36                 7    80.56%          22                 5    77.27%          89                20    77.53%          10                 2    80.00%
libs/libcommon/include/common/avx2_mem_utils.h                   269                37    86.25%          15                 0   100.00%         276                11    96.01%         146                 2    98.63%
libs/libcommon/include/common/mem_utils_opt.h                     14                 0   100.00%           3                 0   100.00%          44                21    52.27%           8                 0   100.00%
libs/libcommon/include/common/memcpy.h                             1                 0   100.00%           1                 0   100.00%           7                 3    57.14%           0                 0         -
libs/libcommon/include/common/sse2_memcpy.h                       46                 0   100.00%           1                 0   100.00%         111                27    75.68%          18                 1    94.44%
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                           6366              4069    36.08%         542               229    57.75%        6135              3053    50.24%        3228              1906    40.95%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18846      8119             56.92%    218885  83456        61.87%

full coverage report (for internal network access only)

@sre-bot
Collaborator

sre-bot commented Sep 16, 2022

Coverage for changed files

Filename                                                     Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dbms/src/AggregateFunctions/AggregateFunctionMinMaxAny.h         283               185    34.63%          86                50    41.86%         484               342    29.34%         174               129    25.86%
dbms/src/AggregateFunctions/IAggregateFunction.h                 171               135    21.05%          35                18    48.57%         226               182    19.47%         102                89    12.75%
dbms/src/Columns/ColumnString.cpp                                172                81    52.91%          26                12    53.85%         399               204    48.87%         118                66    44.07%
dbms/src/Columns/ColumnString.h                                   58                 7    87.93%          32                 3    90.62%         161                23    85.71%          18                 3    83.33%
dbms/src/Columns/ColumnsCommon.cpp                                72                15    79.17%          17                 7    58.82%         158                29    81.65%          44                 8    81.82%
dbms/src/Common/ColumnsHashing.h                                 113                80    29.20%          23                13    43.48%         230               142    38.26%          52                44    15.38%
dbms/src/Common/ColumnsHashingImpl.h                              86                38    55.81%          24                13    45.83%         159                57    64.15%          30                13    56.67%
dbms/src/Functions/FunctionsComparison.h                         604               301    50.17%          63                28    55.56%         949               502    47.10%         476               284    40.34%
dbms/src/Functions/LeastGreatest.h                                25                 6    76.00%          10                 4    60.00%          50                12    76.00%          10                 2    80.00%
dbms/src/Interpreters/Aggregator.cpp                            2852              2063    27.66%          75                40    46.67%        1618               995    38.50%        1236               943    23.71%
dbms/src/Interpreters/Aggregator.h                              1025               849    17.17%          43                21    51.16%         205                84    59.02%         422               177    58.06%
dbms/src/Storages/Transaction/Collator.h                          21                 1    95.24%           9                 1    88.89%          37                 1    97.30%          14                 2    85.71%
dbms/src/Storages/Transaction/CollatorCompare.h                   26                 4    84.62%           9                 1    88.89%          42                 5    88.10%          10                 1    90.00%
dbms/src/Storages/Transaction/CollatorUtils.h                      8                 0   100.00%           2                 0   100.00%          26                 3    88.46%           4                 0   100.00%
dbms/src/Storages/Transaction/TiDB.cpp                           484               261    46.07%          46                13    71.74%         864               389    54.98%         336               144    57.14%
libs/libcommon/include/common/StringRef.h                         36                 7    80.56%          22                 5    77.27%          89                20    77.53%          10                 2    80.00%
libs/libcommon/include/common/avx2_mem_utils.h                   269                37    86.25%          15                 0   100.00%         276                11    96.01%         146                 2    98.63%
libs/libcommon/include/common/mem_utils_opt.h                     14                 0   100.00%           3                 0   100.00%          44                21    52.27%           8                 0   100.00%
libs/libcommon/include/common/memcpy.h                             1                 0   100.00%           1                 0   100.00%           7                 3    57.14%           0                 0         -
libs/libcommon/include/common/sse2_memcpy.h                       46                 0   100.00%           1                 0   100.00%         111                27    75.68%          18                 1    94.44%
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                           6366              4070    36.07%         542               229    57.75%        6135              3052    50.25%        3228              1910    40.83%

Coverage summary

Functions  MissedFunctions  Executed  Lines   MissedLines  Cover
18849      8119             56.93%    218921  83495        61.86%

full coverage report (for internal network access only)

Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
5 participants